DETAILED ACTION

1. This communication is in response to the Application filed on 11/19/2003. Claims
1-55 are pending and have been examined.

Specification 

2. The title of the invention is not descriptive. A new title is required that is clearly 
indicative of the invention to which the claims are directed. 

Claim Objections 

3. Claims 3 and 28 are objected to because of the following informalities: "a text 
passage" should be "the text passage" in line 2. Appropriate correction is required. 

4. Claims 5 and 30 are objected to because of the following informalities: "a text 
passage" should be "the text passage" in line 2. Appropriate correction is required. 

5. Claims 4, 9-15, 29, and 34-40 are objected to as being based upon an objected 
to claim. 

Claim Rejections - 35 USC § 101 

6. 35 U.S.C. 101 reads as follows: 

Whoever invents or discovers any new and useful process, machine, manufacture, or composition of 
matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the 
conditions and requirements of this title. 

The claimed invention is directed to non-statutory subject matter.

Claims 26-40 are directed toward non-statutory subject matter.
The term "computer usable storage medium" is not fully explained in the
Applicant's specification. Hence, the term "computer usable medium" has also been
interpreted to include signals and carrier waves, which are non-statutory. See MPEP
2106.01 [R-5].



Claim Rejections - 35 USC § 112

7. The following is a quotation of the first paragraph of 35 U.S.C. 112: 

The specification shall contain a written description of the invention, and of the manner and process of 
making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the 
art to which it pertains, or with which it is most nearly connected, to make and use the same and shall 
set forth the best mode contemplated by the inventor of carrying out his invention. 

8. Claim 16 is rejected under 35 U.S.C. 112, first paragraph, because the 
specification, while being enabling for "text pattern recognition rule", does not 
reasonably provide enablement for the cited limitation as a result of undue breadth. The 
specification does not enable any person skilled in the art to which it pertains, or with 
which it is most nearly connected, to make and use the invention commensurate in 
scope with these claims. The claim is a single means claim which covers every
structure for achieving the stated property (see MPEP 2164.08(a)).

9. The following is a quotation of the second paragraph of 35 U.S.C. 112: 

The specification shall conclude with one or more claims particularly pointing out and distinctly 
claiming the subject matter which the applicant regards as his invention. 

10. Claims 4, 29, and 44 are rejected under 35 U.S.C. 112, second paragraph, as being
indefinite for failing to particularly point out and distinctly claim the subject matter which
applicant regards as the invention. The limitation "other interesting attributes of the text"
is unclear as to what the Applicant is referring to. Hence, for the purposes of compact
prosecution the limitation was interpreted to mean any additional attribute.

11. Claims 8, 25, 33, and 48 are rejected under 35 U.S.C. 112, second paragraph,
as being indefinite for failing to particularly point out and distinctly claim the subject
matter which applicant regards as the invention. The limitation "constituent attributes 
assigned yes-no values to patterns of base tokens, where the entire pattern is 
considered to be a single constituent with respect to some annotation value" is unclear 
as to what the Applicant is referring to. Hence, for the purposes of compact prosecution 
the limitation was interpreted to mean constituent attributes assigned to base tokens, 
where the series of base tokens identify a pattern. 

Claim Rejections - 35 USC § 102 

12. The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that
form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless - 

(b) the invention was patented or described in a printed publication in this or a foreign country or in public 
use or on sale in this country, more than one year prior to the date of application for patent in the United 
States. 

13. Claims 1-4, 8, 9, 10, 12, 14-16, 18-21, 25-30, 33-35, 37, 39, 41-44, 48-50, 52, 54 
and 55 are rejected under 35 U.S.C. 102(b) as being anticipated by Cunningham et al. 
(Developing Language Processing Component with GATE (a User Guide), 2001-2002).

As to claims 1, 26, and 41, Cunningham et al. discloses a fact extraction tool set
for extracting information from a document, comprising:

means for annotating a text (see sect. 6.1, 6.4, and 6.5) (e.g. The text is
tokenized and annotated with part of speech and semantics, among others); and

means for extracting facts from the annotated text (see page 104, 6.8) 
(e.g. The example is extracting the phrase 800,000, US dollars from the text 
using the annotations). 

As to claims 2, 27, and 42, Cunningham et al. discloses wherein, 

the means for annotating a text comprises means for assigning syntactic 
and semantic attributes to a text passage (see sect. 6.4 and 6.5, and page 62,
sect. 4.4.2, last paragraph) (e.g. From the cited sections it is seen that the text
passage is annotated by the semantics and syntax of each word) by at least one
of parsing the text passage (see page 68, sect. 4.5.2, 1st paragraph) and
applying text annotation processes (see sect. 6.4 and 6.5) (e.g. It is evident that
semantics and syntactic elements are annotated and see sect. 6.1, which 
describes orthography annotations) other than parsing the text passage. 

As to claims 3, 28, and 43, Cunningham et al. discloses wherein, 

the means for assigning syntactic and semantic attributes to a text 
passage (see sect. 6.4 and 6.5, and page 62, sect. 4.4.2, last paragraph) 
comprises means for breaking the text passage into its base tokens and 
annotating the base tokens and patterns of base tokens (see sect. 6.1, page 94,
1st paragraph) (e.g. It is implied that the annotations will be made to base tokens
as well as patterns of base tokens depending on relationships and coreferences.)
with a number of orthographic (see sect. 6.1, page 94, 1st paragraph), syntactic
(see sect. 4.4.2, last paragraph), semantic (see sect. 6.5), pragmatic (see sect.
6.7.1, 1st paragraph) (e.g. The applicant refers to pragmatic as being identifying
quotations, see Applicant's specification, page 23, line 4) and dictionary-based
attributes (see sect. 6.6.2 and see 6.2) (e.g. A table is used to determine if
strings are of the same entity and the latter citation refers to names and cities).

As to claims 4, 21, 29, and 44, Cunningham et al. discloses wherein, 

the attributes include tokenization (see sect. 6.1), text normalization (see , 
part of speech tags (see sect. 6.4), sentence boundaries (see sect. 6.3), parse
trees (see page 62, sect. 4.4.2, last paragraph-page 63, first three lines) (e.g. It is
seen that annotations can be represented in the hierarchical form of a
parse tree), semantic attribute tagging (see sect. 6.5) and other interesting
attributes of the text (see sect. 6.6). 

As to claims 8, 25, 33, and 48, Cunningham et al. discloses wherein, the means for
breaking the text passage into its base tokens and annotating the base tokens 
and patterns of base tokens comprises independent annotators, wherein the 
annotators are of three types comprising: 

token attributes, which have a one-per-base-token alignment, where for
the attribute type represented, there is an attempt to assign an attribute to each
base token (see sect. 6.1, 6.1.2) (e.g. From the cited sections, once the text is
broken into tokens, the attributes are identified, regarding punctuation, symbols,
space, number, and orthographic type);

constituent attributes assigned yes-no values to patterns of base tokens, 
where the entire pattern is considered to be a single constituent with respect to 
some annotation value (see page 62, last paragraph, and page 63, 1st three
lines, and table 4.1) (e.g. From the tokenization, pos is used and tagged.
Further, the annotations can be used to show the hierarchical representation of
the text. Further, it is seen that all of the tokens represent a pattern
associated with the sentence.);

and links, which assign common identifiers to coreferring and other related 
patterns of base tokens (see sect. 6.6) (e.g. In this cited section relations 
between identities are found for match names (see sect. 6.7) (e.g. pronominal 
coreference). Hence, it is implied by the reference that identifiers are used to 
relate associated pronouns (See page 101, "Pronoun resolution")).

As to claims 9, 10, 34, 35, 49, and 50, Cunningham et al. discloses wherein,

the means for annotating a text further comprises means for associating
all annotations assigned to a particular piece of text (see page 81, 2nd paragraph,
three bullets) (e.g. From the cited section it is evident that a pattern is specified
by specifying attributes to the tokens and then specifying an annotation based
upon previous assignment), with the base tokens for that text to generate aligned
annotations (e.g. This is implied when matching patterns.).

As to claims 12, 16, 37, and 52, Cunningham et al. discloses wherein,

the means for identifying and extracting potentially interesting pieces of
information text (see page 104, 6.8) comprises at least one text pattern
recognition rule written in a rule-based information extraction language (see page
81, 2nd paragraph, three bullets and sect. 6.1.1) (e.g. From the cited section it is
evident that a pattern is specified by specifying attributes to the tokens and then
specifying an annotation based upon previous assignment. LHS and RHS rules
are used), wherein the at least one text pattern recognition rule queries for at
least one of literal text, attributes, and relationships found in the aligned 
annotations to define the facts to be extracted (see page 81, last two paragraphs, 
and pages 82 and 83) (e.g. It is evident that from the input, attributes or 
annotations are specified and the latter citation is shown as a variety of data 
formats are possible and are looked upon in an existing list, which are compared 
(queried)). 

As to claims 14, 18, 39, and 54, Cunningham et al. discloses wherein,

the at least one text pattern recognition rule comprises a pattern that 
describes the text of interest (see page 82, 3rd paragraph, and rule below) (e.g.
From the cited portion a definition of GazLocation is given for a portion of the
pattern. This is an example of a rule.), a label that names the pattern for testing
and debugging purposes (see page 81, 2nd paragraph and 2nd bullet) (e.g. A
label for debugging can be set in order to see any conflicts.); and
an action that indicates what should be done in response to a successful match
(see page 142, numeral 2, subnumeral 2) (e.g. The algorithm in the cited section
is used in the JAPE rules, which is a finite state machine).

As to claims 15, 19, 40, and 55, Cunningham et al. discloses wherein, 

the means for identifying and extracting potentially interesting
pieces of information further comprises at least one auxiliary definition statement
used to name and define a fragment of a pattern (see page 84, 1st and 2nd
paragraph) (e.g. The auxiliary definition or label is assigned to the year based on
the pattern of the word "in" or "by" found in the text).



As to claim 20, Cunningham et al. discloses a text annotation tool
comprising:

means for assigning syntactic and semantic attributes to a text passage 
(see sect. 6.4 and 6.5, and page 62, sect. 4.4.2, last paragraph) comprises 
means for breaking the text passage into its base tokens and annotating the 
base tokens and patterns of base tokens (see sect. 6.1, page 94, 1st paragraph)
(e.g. It is implied that the annotations will be made to base tokens as well as
patterns of base tokens depending on relationships and coreferences.) with a
number of orthographic (see sect. 6.1, page 94, 1st paragraph), syntactic (see
sect. 4.4.2, last paragraph), semantic (see sect. 6.5), pragmatic (see sect. 6.7.1,
1st paragraph) (e.g. The applicant refers to pragmatic as being identifying
quotations, see Applicant's specification, page 23, line 4) and dictionary-based
attributes (see sect. 6.6.2 and see 6.2) (e.g. A table is used to determine if
strings are of the same entity and the latter citation refers to names and cities).

means for associating all annotations assigned to a particular piece of text
with the base tokens for that text to generate aligned annotations (see page 81,
2nd paragraph, three bullets) (e.g. From the cited section it is evident that a
pattern is specified by specifying attributes to the tokens and then specifying an
annotation based upon previous assignment) (e.g. This is implied when matching
patterns.).



Claim Rejections - 35 USC § 103 

14. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

15. Claims 5-7, 22-24, 30-32, and 45-47 are rejected under 35 U.S.C. 103(a) as 
being unpatentable over Cunningham et al. in view of Broder et al. (US 2004/0243645).



As to claims 5, 22, 30, and 45, Cunningham et al. discloses wherein,

the means for assigning syntactic and semantic attributes to a text
passage (see sect. 6.5 and sect. 4.4.2, last paragraph).

However, Cunningham et al. does not specifically disclose that the means
comprises independent annotators.

Broder et al. discloses the use of independent annotators (see [0153]) 
(e.g. It is seen that independent annotations are used for each type of word 
pairs.) 

It would have been obvious to one of ordinary skill in the art at the time
the invention was made to have combined the fact extraction and annotation
taught by Cunningham et al. with the independent annotation taught by Broder et
al. The motivation to use independent annotators is to resolve the issue of
overlapping annotations that occurs in nested XML (see Broder et al. [0128]).

As to claims 6, 23, 31, and 46, Cunningham et al. discloses

the use of XML for representing annotated text (see page 60, sect. 4.4.1,
1st paragraph).

As to claims 7, 24, 32, and 47, Broder et al. discloses 

means for resolving conflicting annotation boundaries in the annotated text
to produce well-formed XML from the results of independent annotators (see
[0153]) (e.g. From the cited sections it is seen that the boundaries of the word
pairs are resolved from the previous conflict for differentiation by using separate
annotations).

16. Claims 11, 36, and 51 are rejected under 35 U.S.C. 103(a) as being 
unpatentable over Cunningham et al. in view of Marcus et al. ("The PENN Treebank 
Annotating Predicate Argument Structure", 1994). 

As to claims 11, 36, and 51, Cunningham et al. discloses wherein,

the means for identifying and extracting potentially interesting pieces of
information comprises means for recognizing both true left and right constituent
attributes (see sect. 6.1.1 and page 81, 1st paragraph) (e.g. It is seen that
left and right attributes are recognized by the tokeniser. Further it is admitted in the
Applicant's background that many pattern recognition languages have rules that
process text in left to right fashion (see Applicant's Specification, page 3, lines 2-3))
and constituent attributes (see page 63, 1st paragraph).

However, Cunningham et al. does not specifically disclose the 
identification of non-contiguous attributes. 

Marcus et al. does disclose the identification of non-contiguous attributes 
(see page 117, sect. 6, 2nd paragraph and example at bottom of page 117 in the
right-hand column) (e.g. An index number is added to the label of the original
constituent and allows interpretation).

It would have been obvious to one of ordinary skill in the art at the time
the invention was made to have modified the fact extraction taught by
Cunningham et al. with the identification of non-contiguous attributes taught by
Marcus et al. The motivation to have combined the references involves the
ability to represent sentences where complements of verbs occur after a
sentential level verb (see Marcus et al., page 117, sect. 6, 1st paragraph), which
would benefit the fact extraction tool taught by Cunningham et al. for recognizing
discontinuous constituents.

17. Claims 13, 17, 38, and 53 are rejected under 35 U.S.C. 103(a) as being
unpatentable over Cunningham et al. in view of Feldman et al. (US 6,442,545).

As to claims 13, 17, 38, and 53, Cunningham et al. discloses wherein,

the text pattern recognition rule regular expression functionality (see page
7, sect. 1.3.3, last two lines) and auxiliary definition (see page 82, 3rd paragraph,
and rule below) (e.g. From the cited portion a definition of GazLocation is given
for a portion of the pattern.).

However, Cunningham et al. does not specifically disclose the XPath-based
functionality.

Feldman et al. does disclose the use of XPath-based (tree traversal, as also
defined by the Applicant, see Applicant's Specification, page 3, line 2)
functionality (see col. 2, lines 15-22) (e.g. Hierarchical taxonomies are referred to 
and relationships are built). 

It would have been obvious to one of ordinary skill in the art at the time
the invention was made to have modified the fact extraction taught by
Cunningham et al. with the use of XPath functionality taught by Feldman et al.
The motivation to have combined the references involves content-based and
quantitative analysis of documents (see col. 2, lines 19-22).

Conclusion 

18. The prior art made of record and not relied upon is considered pertinent to 
applicant's disclosure. 

Carus (US 5,890,103) is cited to disclose information retrieval by tokenizing
text and assigning tags. Walker (US 6,279,017) is cited to teach extracting text specific
attributes from machine readable text. Arnold et al. (US 6,910,003) is cited to disclose
searching of information based on concepts. Murata et al. (US 2002/0013694) is cited to
teach a syntax analysis by using natural language patterns and seeing if it meets a tree
structure. Simpson et al. (US 2003/0167162) is cited to teach identification of word
patterns in a semantic network. Fass et al. (US 2004/0078190) is cited to teach
information retrieval that matches concepts to user queries utilizing annotations.

Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Paras Shah whose telephone number is (571)270-1650. 
The examiner can normally be reached on MON.-THURS. 7:30a.m.-4:00p.m. EST. 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Patrick Edouard, can be reached on (571)272-7603. The fax phone number
for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of an application may be obtained from the
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a 
USPTO Customer Service Representative or access to the automated information 
system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. 